Dynamic Information Flow Tracking on Multicores
Dynamic Information Flow Tracking (DIFT) is a promising technique for detecting software attacks. Due to the computationally intensive nature of the technique, prior efficient implementations [21, 6] rely on specialized hardware support whose only purpose is to enable DIFT. Alternatively, prior software implementations are either too slow [17, 15], resulting in execution time increases of as much as fourfold for SPEC integer programs, or they are not transparent [31], requiring source code modifications. In this paper, we propose the use of chip multiprocessors (CMP) to perform DIFT transparently and efficiently. We spawn a helper thread that is scheduled on a separate core and is responsible only for performing information flow tracking operations. This entails the communication of registers and flags between the main and helper threads. We explore software (shared memory) and hardware (dedicated interconnect) approaches to enable this communication. Finally, we propose a novel application of the DIFT infrastructure in which, in addition to detecting the software attack, DIFT assists in identifying the cause of the bug in the code that enabled the exploit in the first place. We conducted detailed simulations to evaluate the overhead of performing DIFT and found it to be 48% for SPEC integer programs.
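The core DIFT operation — carrying a shadow taint bit alongside each data value, propagating it through computation, and flagging tainted control transfers — can be sketched as follows. This is a hypothetical illustration on a toy register machine, not the paper's CMP-based implementation; all class and method names are invented.

```python
# Minimal sketch of DIFT taint propagation (illustrative only; the
# paper offloads these shadow operations to a helper thread on a
# separate core). Each register carries a shadow taint bit; arithmetic
# ORs the operand taints, and using a tainted value as a jump target
# signals a potential attack.

TAINTED, CLEAN = True, False

class DIFTMachine:
    def __init__(self):
        self.regs = {}   # register name -> value
        self.taint = {}  # register name -> shadow taint bit

    def load_input(self, reg, value):
        """Data from an untrusted source (e.g. the network) is tainted."""
        self.regs[reg] = value
        self.taint[reg] = TAINTED

    def load_const(self, reg, value):
        """Program constants are clean."""
        self.regs[reg] = value
        self.taint[reg] = CLEAN

    def add(self, dst, a, b):
        self.regs[dst] = self.regs[a] + self.regs[b]
        # Propagation rule: the result is tainted if any operand is.
        self.taint[dst] = self.taint[a] or self.taint[b]

    def jump_to(self, reg):
        # Check rule: a control transfer through tainted data is flagged.
        if self.taint[reg]:
            raise RuntimeError("DIFT alert: tainted jump target")
        return self.regs[reg]
```

A jump through a register derived from untrusted input then raises an alert, while jumps through clean registers proceed normally.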
Parallel simplex algorithms and loop spreading
Parallel solutions for two classes of linear programs are presented. First, we parallelized the two-phase revised simplex algorithm and showed that it is possible to obtain a linear improvement in performance. The simplex algorithm is the best known algorithm for solving linear programs, and we claim our result is the best that can be achieved.
Next, we study the parallelization of the decomposed simplex algorithm. One of our new parallel algorithms achieves a performance improvement of 2*P over the decomposed simplex algorithm using P processors. In addition, we discovered a particular variation of the decomposed simplex algorithm that runs 2 times faster than the original one. The new parallel algorithm linearly speeds up this fast sequential algorithm.
As in any parallel program, unbalanced processor load causes the performance of the parallel decomposed simplex algorithm to drop significantly when the size of the input data is not a multiple of the number of available processors. To remove this limitation, we developed a load-balancing technique called Loop Spreading that evenly distributes parallel tasks over multiple processors without a drop in performance, even when the size of the input data is not a multiple of the number of processors. Loop Spreading is a general technique that can be applied automatically by a compiler to balance processor load in any language that supports parallel loop constructs.
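The balancing idea behind Loop Spreading — no processor should receive more than one iteration beyond any other, even when the iteration count is not a multiple of the processor count — can be sketched as follows. This is a hypothetical illustration of the scheduling invariant, not the paper's compiler transformation; the function name is invented.

```python
def spread_iterations(n_iters, n_procs):
    """Assign n_iters loop iterations to n_procs processors so that
    every processor receives either floor(n/p) or ceil(n/p) of them.
    Illustrative sketch of the load-balancing invariant only."""
    base, extra = divmod(n_iters, n_procs)
    schedule, start = [], 0
    for p in range(n_procs):
        # The first `extra` processors each take one additional iteration.
        count = base + (1 if p < extra else 0)
        schedule.append(list(range(start, start + count)))
        start += count
    return schedule
```

For example, 10 iterations over 4 processors yields chunk sizes 3, 3, 2, 2 rather than 3, 3, 3, 1, so the longest-loaded processor carries at most one extra iteration.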
Parallelism encapsulation in C++
Object-oriented programming features information hiding and encapsulation, meaning that 1) each object hides its implementation details from access from outside, and only a set of methods (interface routines) are visible outside of the object, and 2) changes to the implementation of the object do not require changes to the code that uses the object, so long as the interface is stable. However, the interface mechanism in C++ is not adequate to achieve information hiding and encapsulation when writing parallel C++ programs, since the methods are assumed to be invoked in sequence and no parallel interactions are represented by them. Also, even when the methods are the same, changes to the implementation details of the methods often affect the interaction pattern of the methods, so the parallel code that uses the methods must be rewritten. To achieve information hiding and encapsulation, we propose adding path expressions to the class interface. Thus either dynamic or automatic parallelization can be used to achieve parallelism encapsulation. A new concept of data dependence analysis is introduced which uses the parallelism described by path expressions to efficiently and automatically parallelize an object-oriented program.
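The idea of a path expression — an interface-level constraint on the order in which methods may be invoked — can be illustrated with a small runtime checker. This is a hypothetical sketch: the paper attaches path expressions to C++ class interfaces for parallelization, whereas the class and constraint below are invented purely to show what such an ordering constraint expresses.

```python
class PathExpressionError(Exception):
    """Raised when a method call violates the declared invocation order."""

class BoundedBuffer:
    """Sketch of a class whose interface carries an ordering constraint
    in the spirit of a path expression such as

        path (put; get)* end

    i.e. each get must be preceded by a matching put. Here the
    constraint is checked at run time; the abstract describes using
    such declarations to drive dependence analysis and parallelization
    instead."""

    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def get(self):
        if not self._items:
            # Constraint violated: get invoked before a matching put.
            raise PathExpressionError("path (put; get)* violated")
        return self._items.pop(0)
```

Because the constraint lives in the interface rather than the implementation, callers (and a parallelizing compiler) can rely on it even if the buffer's internal representation changes.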
Parallel algorithms for decomposed linear programs
New parallel algorithms for solving decomposed linear programs are developed. Direct parallelization of the sequential algorithm results in very limited performance improvement using multiple processors. By redesigning the algorithm, we achieved more than 2*P times performance improvement over the sequential algorithm, where P is the number of processors used in parallel computation. Furthermore, a particular variation of the sequential algorithm runs more than 2 times faster than the original sequential algorithm. The new parallel algorithm linearly speeds up the new sequential algorithm.
Parallelizing WHILE loops
Two methods for parallelizing WHILE loops are presented. The first method converts a WHILE loop into a FORALL construct, and the second method pipelines a WHILE loop. Each of the methods is based on a transformation that makes the loop count explicit. Also, we propose two parallel WHILE constructs.
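The transformation that makes the loop count explicit can be sketched as a two-phase scheme: a sequential pass runs only the loop-control update to enumerate the per-iteration states, after which the loop bodies form a counted loop that a FORALL could execute independently. This is a hypothetical illustration (the decomposition into `cond`, `step`, and `body`, and the assumption that `body` does not feed back into the loop control, are ours, not the paper's):

```python
def while_to_forall(cond, step, body, state):
    """Sketch of WHILE-loop parallelization via explicit counting.
    Phase 1 runs only the sequential state update to record each
    iteration's state; phase 2 applies `body` to every recorded state,
    which a FORALL could do in parallel since the bodies are now
    independent. Illustrative only."""
    # Phase 1: make the loop count (and per-iteration states) explicit.
    states = []
    while cond(state):
        states.append(state)
        state = step(state)
    # Phase 2: a counted, parallelizable loop over the recorded states.
    return [body(s) for s in states]
```

For example, `while_to_forall(lambda i: i < 5, lambda i: i + 1, lambda i: i * i, 0)` enumerates the states 0 through 4 sequentially and then computes the five squares as independent body instances.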
East Asian hydroclimate modulated by the position of the westerlies during Termination I.
Speleothem oxygen isotope records have revolutionized our understanding of the paleo East Asian monsoon, yet there is fundamental disagreement on what they represent in terms of hydroclimate changes. We report a multiproxy speleothem record of monsoon evolution during the last deglaciation from the middle Yangtze region, which indicates a wetter central eastern China during North Atlantic cooling episodes, despite the oxygen isotopic record suggesting a weaker monsoon. We show that this apparent contradiction can be resolved if the changes are interpreted as a lengthening of the Meiyu rains and a shortened post-Meiyu stage, in accordance with a recent hypothesis. Model simulations support this interpretation and further reveal the role of the westerlies in communicating the North Atlantic influence to the East Asian climate.
Static branch frequency and program profile analysis
Program profiles identify frequently executed portions of a program, which are the places at which optimizations offer programmers and compilers the greatest benefit. Compilers, however, infrequently exploit program profiles, because profiling a program requires a programmer to instrument and run the program. An attractive alternative is for the compiler to statically estimate program profiles. This paper presents several new techniques for static branch prediction and profiling. The first technique combines multiple predictions of a branch's outcome into a prediction of the probability that the branch is taken. Another technique uses these predictions to estimate the relative execution frequency (i.e., profile) of basic blocks and control-flow edges within a procedure. A third algorithm uses local frequency estimates to predict the global frequency of calls, procedure invocations, and basic block and control-flow edge executions. Experiments on the SPEC92 integer benchmarks and Unix applications show that the frequently executed blocks, edges, and functions identified by our techniques closely match those in a dynamic profile.
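The first technique — combining several heuristic predictions of one branch into a single taken-probability — can be sketched with the standard two-outcome evidence-combination rule. The function below is an illustrative sketch of that combination step, not a reproduction of the paper's heuristics or their measured hit rates:

```python
def combine_predictions(probs):
    """Combine independent heuristic estimates of the probability that
    a branch is taken into one estimate, using the two-outcome
    evidence-combination rule applied pairwise:

        p <- (p * q) / (p * q + (1 - p) * (1 - q))

    Illustrative sketch of the combination step; each element of
    `probs` is one heuristic's taken-probability in (0, 1)."""
    taken, not_taken = 1.0, 1.0
    for p in probs:
        taken *= p
        not_taken *= 1.0 - p
    return taken / (taken + not_taken)
```

Note the rule's behavior: a neutral estimate of 0.5 leaves the combined probability unchanged, while two heuristics that agree (e.g. both 0.7) reinforce each other to a combined probability above either one alone.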